Smart Paper Notes

ConsistTL: Modeling Consistency in Transfer Learning for Low-Resource Neural Machine Translation

Zhaocong Li , Xuebo Liu , Derek F. Wong , Lidia S. Chao , Min Zhang

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning

2022-12-08

Transfer learning is a simple and powerful method that can be used to boost model performance of low-resource neural machine translation (NMT). Existing transfer learning methods for NMT are static, which simply transfer knowledge from a parent model to a child model once via parameter initialization. In this paper, we propose a novel transfer learning method for NMT, namely ConsistTL, which can continuously transfer knowledge from the parent model during the training of the child model. Specifically, for each training instance of the child model, ConsistTL constructs the semantically-equivalent instance for the parent model and encourages prediction consistency between the parent and child for this instance, which is equivalent to the child model learning each instance under the guidance of the parent model. Experimental results on five low-resource NMT tasks demonstrate that ConsistTL results in significant improvements over strong transfer learning baselines, with a gain up to 1.7 BLEU over the existing back-translation model on the widely-used WMT17 Turkish-English benchmark. Further analysis reveals that ConsistTL can improve the inference calibration of the child model. Code and scripts are freely available at https://github.com/NLP2CT/ConsistTL.
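
The consistency objective described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the consistency term is a KL divergence between the parent's and the child's output distributions for one target token, added to the child's usual cross-entropy loss with a hypothetical weight `alpha`. The abstract does not state the exact divergence or weighting, and the function names are made up for this example.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)


def cross_entropy(probs, target_index, eps=1e-12):
    """Negative log-likelihood of the reference token under `probs`."""
    return -math.log(probs[target_index] + eps)


def child_loss(parent_probs, child_probs, target_index, alpha=1.0):
    """Child training loss for one token (assumed form).

    parent_probs: parent model's distribution on the semantically equivalent
                  parent-side instance (treated as a fixed guide here).
    child_probs:  child model's distribution on the child-side instance.
    alpha:        hypothetical weight of the consistency term.
    """
    supervised = cross_entropy(child_probs, target_index)
    consistency = kl_divergence(parent_probs, child_probs)
    return supervised + alpha * consistency


# Toy three-word vocabulary: the parent is confident, the child is not yet.
parent = [0.8, 0.1, 0.1]
child = [0.4, 0.3, 0.3]
loss = child_loss(parent, child, target_index=0, alpha=0.5)
```

Setting `alpha=0` recovers ordinary training of the child, which corresponds to the static, initialization-only transfer that the abstract contrasts against.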


Improving Simultaneous Machine Translation with Monolingual Data

Hexuan Deng , Liang Ding , Xuebo Liu , Meishan Zhang , Dacheng Tao , Min Zhang

Categories: Natural Language Processing

2022-12-02

Simultaneous machine translation (SiMT) is usually done via sequence-level knowledge distillation (Seq-KD) from a full-sentence neural machine translation (NMT) model. However, there is still a significant performance gap between NMT and SiMT. In this work, we propose to leverage monolingual data to improve SiMT by training a SiMT student on the combination of bilingual data and external monolingual data distilled by Seq-KD. Preliminary experiments on En-Zh and En-Ja news domain corpora demonstrate that monolingual data can significantly improve translation quality (e.g., +3.15 BLEU on En-Zh). Inspired by the behavior of human simultaneous interpreters, we propose a novel monolingual sampling strategy for SiMT that considers both chunk length and monotonicity. Experimental results show that our sampling strategy consistently outperforms random sampling (and other conventional NMT monolingual sampling strategies) by avoiding hallucination, the key problem of SiMT, and that it scales better. We achieve an average improvement of +0.72 BLEU over random sampling on En-Zh and En-Ja. Data and code can be found at https://github.com/hexuandeng/Mono4SiMT.


CVR-LSE: Compact Vectorization Representation of Local Static Environments for Unmanned Ground Vehicles

Haiming Gao , Qibo Qiu , Wei Hua , Xuebo Zhang , Zhengyong Han , Shun Zhang

Categories: Robotics

2022-06-14

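To meet the requirements of general static obstacle detection, this paper proposes a compact vectorization representation of local static environments for unmanned ground vehicles. First, high-frequency pose information is obtained by fusing LiDAR and IMU data. Then, based on the generation of two-dimensional (2D) obstacle points, a maintenance process for a fixed-size grid map is proposed. Finally, the local static environment is described by multiple convex polygons, obtained through double-threshold-based boundary simplification and convex polygon segmentation. The proposed method has been applied in a practical driverless project in a park, and qualitative experimental results on typical scenes verify its effectiveness and robustness. In addition, quantitative evaluation shows that, compared with traditional grid-map-based methods, it represents the local static environment with fewer points (a reduction of about 60%). Furthermore, the running time (15 ms) shows that the proposed method can be used for real-time perception of local static environments. The corresponding code is available at https://github.com/ghm0819/cvr_lse.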


E$\mathbf{^3}$MoP: Efficient Motion Planning Based on Heuristic-Guided Motion Primitives Pruning and Path Optimization With Sparse-Banded Structure

Jian Wen , Xuebo Zhang , Haiming Gao , Jing Yuan , Yongchun Fang

Categories: Robotics

2020-12-16

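To solve the autonomous navigation problem in complex environments, this paper presents a new, efficient motion planning approach. Considering the challenges of large-scale, partially unknown complex environments, a three-layer motion planning framework is carefully designed, comprising global path planning, local path optimization, and time-optimal velocity planning. Compared with existing approaches, the novelty of this work is twofold: 1) a new heuristic-guided pruning strategy for motion primitives is proposed and fully integrated into the state-lattice-based global path planner to further improve the computational efficiency of graph search, and 2) a new soft-constrained local path optimization approach is proposed, in which the sparse-banded system structure of the underlying optimization problem is fully exploited to solve the problem efficiently. We validate the safety, smoothness, flexibility, and efficiency of our approach in various complex simulation scenarios and challenging real-world tasks. The results show that, compared with a recent Bézier-curve-based state-space sampling approach, computational efficiency in the global planning stage improves by 66.21% and the motion efficiency of the robot improves by 22.87%. We name the proposed motion planning framework E$\mathrm{^3}$MoP, where the number 3 indicates not only that our approach is a three-layer framework but also that it is efficient in all three stages.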
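
To illustrate why a sparse-banded structure matters for path optimization, here is a minimal sketch that is not taken from the paper. It assumes a toy one-dimensional smoothing problem, minimizing sum((x_i - y_i)^2) + w * sum((x_{i+1} - x_i)^2), whose normal equations form a tridiagonal (bandwidth-1) system that the Thomas algorithm solves in linear time. The paper's actual cost terms and band width may differ, and `solve_tridiagonal`, `smooth_path`, and `weight` are hypothetical names.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve A x = rhs for tridiagonal A with the Thomas algorithm, O(n).

    lower[i] = A[i][i-1] (lower[0] unused), diag[i] = A[i][i],
    upper[i] = A[i][i+1] (upper[-1] unused).
    """
    n = len(diag)
    c = [0.0] * n  # modified upper diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):  # forward elimination
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):  # back substitution
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def smooth_path(values, weight):
    """Minimize sum((x_i - y_i)^2) + weight * sum((x_{i+1} - x_i)^2).

    Setting the gradient to zero gives (I + weight * L) x = y, where L is
    the path-graph Laplacian; that matrix is tridiagonal.
    """
    n = len(values)
    if n < 2:
        return list(values)
    lower = [0.0] + [-weight] * (n - 1)
    upper = [-weight] * (n - 1) + [0.0]
    diag = [1.0 + weight * (2 if 0 < i < n - 1 else 1) for i in range(n)]
    return solve_tridiagonal(lower, diag, upper, values)


def roughness(xs):
    """Sum of squared differences between consecutive values."""
    return sum((b - a) ** 2 for a, b in zip(xs, xs[1:]))


zigzag = [0.0, 4.0, 0.0, 4.0, 0.0]
smoothed = smooth_path(zigzag, weight=1.0)
```

A dense solver would cost O(n^3) for the same system, so exploiting the band is what keeps such an optimization cheap as the number of path points grows.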


DL-SLOT: Dynamic LiDAR SLAM and object tracking based on collaborative graph optimization

Xuebo Tian , Zhongyang Zhu , Junqiao Zhao , Gengxuan Tian , Chen Ye

Categories: Robotics

2022-12-05

Ego-pose estimation and dynamic object tracking are two critical problems for autonomous driving systems. The solutions to these problems are generally based on their respective assumptions, i.e., the static world assumption for simultaneous localization and mapping (SLAM) and the accurate ego-pose assumption for object tracking. However, these assumptions are difficult to satisfy in dynamic road scenarios, where SLAM and object tracking become closely correlated. Therefore, we propose DL-SLOT, a dynamic LiDAR SLAM and object tracking method, to address these two coupled problems simultaneously. This method integrates the state estimations of both the autonomous vehicle and the stationary and dynamic objects in the environment into a unified optimization framework. First, we use object detection to identify all points belonging to potentially dynamic objects. Subsequently, LiDAR odometry is performed using the filtered point cloud. Simultaneously, we propose a sliding window-based object association method that accurately associates objects according to the historical trajectories of tracked objects. The ego-states and those of the stationary and dynamic objects are integrated into the sliding window-based collaborative graph optimization. The stationary objects are subsequently restored from the potentially dynamic object set. Finally, a global pose-graph is implemented to eliminate the accumulated error. Experiments on KITTI datasets demonstrate that our method achieves better accuracy than SLAM and object tracking baseline methods. This confirms that solving SLAM and object tracking simultaneously is mutually advantageous, dramatically improving the robustness and accuracy of SLAM and object tracking in dynamic road scenarios.


Variance-Aware Machine Translation Test Sets

Runzhe Zhan , Xuebo Liu , Derek F. Wong , Lidia S. Chao

Categories: Natural Language Processing

2021-11-07

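We release 70 small and discriminative test sets for machine translation (MT) evaluation, called variance-aware test sets (VAT), covering 35 translation directions from the WMT16 to WMT20 competitions. VAT is created automatically by a novel variance-aware filtering method that removes the indiscriminative test instances from current MT test sets without any human labor. Experimental results show that VAT outperforms the original WMT test sets in terms of correlation with human judgment across mainstream language pairs and test sets. Further analysis of the properties of VAT reveals linguistic features that are challenging for competitive MT systems (e.g., low-frequency words and proper nouns), providing guidance for constructing future MT test sets. The test sets and the code for preparing variance-aware MT test sets are freely available at https://github.com/nlp2ct/variance-aware-mt-test-sets.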
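
The filtering idea in this abstract can be sketched as follows. This is a hedged illustration rather than the released code: it assumes each test instance already has one automatic metric score per MT system, measures how much those scores disagree via their population variance, and keeps the most discriminative fraction. The function name and the `keep_ratio` default are assumptions made for this example.

```python
from statistics import pvariance


def variance_aware_filter(scores_per_instance, keep_ratio=0.4):
    """Return the indices of the test instances with the highest score variance.

    scores_per_instance[i] holds the scores that different MT systems
    obtained on test instance i. Instances on which all systems score
    alike cannot separate the systems, so they are dropped first.
    """
    if not 0 < keep_ratio <= 1:
        raise ValueError("keep_ratio must be in (0, 1]")
    variances = [pvariance(scores) for scores in scores_per_instance]
    n_keep = max(1, round(len(variances) * keep_ratio))
    ranked = sorted(range(len(variances)), key=lambda i: variances[i], reverse=True)
    return sorted(ranked[:n_keep])  # restore original test-set order


# Four test sentences scored for three hypothetical systems.
scores = [
    [0.5, 0.5, 0.5],  # every system agrees: indiscriminative
    [0.1, 0.9, 0.5],
    [0.4, 0.6, 0.5],
    [0.0, 1.0, 0.5],  # systems disagree most
]
kept = variance_aware_filter(scores, keep_ratio=0.5)
```

Because the procedure needs only system outputs and an automatic metric, it requires no human annotation, which matches the abstract's claim that the test sets are built without human labor.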


TinyMIM: An Empirical Study of Distilling MIM Pre-trained Models

Sucheng Ren , Fangyun Wei , Zheng Zhang , Han Hu

Categories: Computer Vision

2023-01-03

Masked image modeling (MIM) performs strongly in pre-training large vision Transformers (ViTs). However, small models that are critical for real-world applications benefit only marginally, if at all, from this pre-training approach. In this paper, we explore distillation techniques to transfer the success of large MIM-based pre-trained models to smaller ones. We systematically study different options in the distillation framework, including distilling targets, losses, input, network regularization, sequential distillation, etc., revealing that: 1) distilling token relations is more effective than CLS token- and feature-based distillation; 2) using an intermediate layer of the teacher network as the target performs better than using the last layer when the depth of the student does not match that of the teacher; 3) weak regularization is preferred; etc. With these findings, we achieve significant fine-tuning accuracy improvements over MIM pre-training from scratch on ImageNet-1K classification, using the ViT-Tiny, ViT-Small, and ViT-Base models, with gains of +4.2%/+2.4%/+1.4%, respectively. Our TinyMIM model of base size achieves 52.2 mIoU on ADE20K semantic segmentation, which is +4.1 higher than the MAE baseline. Our TinyMIM model of tiny size achieves 79.6% top-1 accuracy on ImageNet-1K image classification, which sets a new record for small vision models of the same size and computation budget. This strong performance suggests an alternative way of developing small vision Transformer models, that is, by exploring better training methods rather than introducing inductive biases into architectures as in most previous works. Code is available at https://github.com/OliverRensu/TinyMIM.
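
Finding 1) above, distilling token relations, can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes a relation is the row-wise softmax of scaled query-key dot products within one attention head, and that the student matches the teacher's relations under a KL divergence. All names here are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def token_relations(queries, keys):
    """One distribution over tokens per query token: softmax(Q K^T / sqrt(d))."""
    scale = 1.0 / math.sqrt(len(queries[0]))
    return [
        softmax([scale * sum(a * b for a, b in zip(q, k)) for k in keys])
        for q in queries
    ]


def relation_distillation_loss(teacher_rel, student_rel, eps=1e-12):
    """Mean KL(teacher || student) over the token rows."""
    total = 0.0
    for t_row, s_row in zip(teacher_rel, student_rel):
        total += sum(t * math.log((t + eps) / (s + eps)) for t, s in zip(t_row, s_row))
    return total / len(teacher_rel)


# Three tokens; the teacher head is 4-dimensional, the student head 2-dimensional.
teacher_q = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
teacher_k = [[2.0, 0.0, 0.0, 0.0], [0.0, 2.0, 0.0, 0.0], [0.0, 0.0, 2.0, 0.0]]
student_q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
student_k = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

teacher_rel = token_relations(teacher_q, teacher_k)
student_rel = token_relations(student_q, student_k)
loss = relation_distillation_loss(teacher_rel, student_rel)
```

One practical consequence of this formulation is that a relation matrix has one row and one column per token regardless of the hidden width, so teacher and student relations can be compared directly without a projection layer, whereas feature-based distillation needs one when the widths differ.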


Cross Modal Transformer via Coordinates Encoding for 3D Object Detection

Junjie Yan , Yingfei Liu , Jianjian Sun , Fan Jia , Shuailin Li , Tiancai Wang , Xiangyu Zhang

Categories: Computer Vision

2023-01-03

In this paper, we propose a robust 3D detector, named Cross Modal Transformer (CMT), for end-to-end 3D multi-modal detection. Without explicit view transformation, CMT takes image and point cloud tokens as inputs and directly outputs accurate 3D bounding boxes. The spatial alignment of multi-modal tokens is performed implicitly, by encoding the 3D points into multi-modal features. The core design of CMT is quite simple, yet its performance is impressive: CMT obtains 73.0% NDS on the nuScenes benchmark. Moreover, CMT remains robust even if the LiDAR input is missing. Code will be released at https://github.com/junjie18/CMT.


Backdoor Attacks Against Dataset Distillation

Yugeng Liu , Zheng Li , Michael Backes , Yun Shen , Yang Zhang

Categories: Machine Learning

2023-01-03

Dataset distillation has emerged as a prominent technique to improve data efficiency when training machine learning models. It encapsulates the knowledge from a large dataset into a smaller synthetic dataset. A model trained on this smaller distilled dataset can attain comparable performance to a model trained on the original training dataset. However, the existing dataset distillation techniques mainly aim at achieving the best trade-off between resource usage efficiency and model utility. The security risks stemming from them have not been explored. This study performs the first backdoor attack against the models trained on the data distilled by dataset distillation models in the image domain. Concretely, we inject triggers into the synthetic data during the distillation procedure rather than during the model training stage, where all previous attacks are performed. We propose two types of backdoor attacks, namely NAIVEATTACK and DOORPING. NAIVEATTACK simply adds triggers to the raw data at the initial distillation phase, while DOORPING iteratively updates the triggers during the entire distillation procedure. We conduct extensive evaluations on multiple datasets, architectures, and dataset distillation techniques. Empirical evaluation shows that NAIVEATTACK achieves decent attack success rate (ASR) scores in some cases, while DOORPING reaches higher ASR scores (close to 1.0) in all cases. Furthermore, we conduct a comprehensive ablation study to analyze the factors that may affect the attack performance. Finally, we evaluate multiple defense mechanisms against our backdoor attacks and show that our attacks can practically circumvent these defense mechanisms.
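
The attack success rate (ASR) reported above can be read as the fraction of trigger-carrying test inputs that the model classifies as the attacker's target label. The sketch below assumes that common definition, plus the frequent convention of excluding samples whose true label already equals the target; the abstract does not spell out either detail, and the function name is hypothetical.

```python
def attack_success_rate(predicted_labels, target_label, true_labels=None):
    """Fraction of triggered inputs classified as `target_label`.

    predicted_labels: model predictions on test inputs that carry the trigger.
    true_labels:      optional ground truth; when given, samples that already
                      belong to the target class are excluded, since they
                      would count as successes even without a backdoor.
    """
    if true_labels is not None:
        if len(true_labels) != len(predicted_labels):
            raise ValueError("label lists must have the same length")
        predicted_labels = [
            p for p, t in zip(predicted_labels, true_labels) if t != target_label
        ]
    if not predicted_labels:
        raise ValueError("no samples left to evaluate")
    hits = sum(1 for p in predicted_labels if p == target_label)
    return hits / len(predicted_labels)


predictions = [7, 7, 3, 7]
asr_all = attack_success_rate(predictions, target_label=7)
asr_filtered = attack_success_rate(predictions, target_label=7, true_labels=[7, 1, 2, 3])
```

Under this reading, an ASR close to 1.0 means that nearly every triggered input is pushed to the target class.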


PMT-IQA: Progressive Multi-task Learning for Blind Image Quality Assessment

Qingyi Pan , Ning Guo , Letu Qingge , Jingyi Zhang , Pei Yang

Categories: Computer Vision

2023-01-03

Blind image quality assessment (BIQA) remains challenging because diverse distortions and varying image content complicate the distortion patterns across different scales and make the regression problem for BIQA harder. However, existing BIQA methods often fail to consider multi-scale distortion patterns and image content, and little research has been done on learning strategies that help the regression model perform better. In this paper, we propose a simple yet effective Progressive Multi-Task Image Quality Assessment (PMT-IQA) model, which contains a multi-scale feature extraction module (MS) and a progressive multi-task learning module (PMT), to help the model learn complex distortion patterns and better optimize the regression problem, in line with the easy-to-hard progression of human learning. To verify the effectiveness of the proposed PMT-IQA model, we conduct experiments on four widely used public datasets. The experimental results indicate that PMT-IQA outperforms the compared approaches and that both the MS and PMT modules improve the model's performance.
